DETAILED ACTION
Drawings 

1. The drawings are objected to because they are informal. Figures 1 to 3 contain
handwritten elements. Also, Figure 3 should include words matching reference
numerals as in Figures 1 and 2.

2. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in
reply to the Office action to avoid abandonment of the application. Any amended 
replacement-drawing sheet should include all of the figures appearing on the immediate 
prior version of the sheet, even if only one figure is being amended. The figure or figure 
number of an amended drawing should not be labeled as "amended." If a drawing 
figure is to be canceled, the appropriate figure must be removed from the replacement 
sheet, and where necessary, the remaining figures must be renumbered and 
appropriate changes made to the brief description of the several views of the drawings 
for consistency. Additional replacement sheets may be necessary to show the 
renumbering of the remaining figures. Each drawing sheet submitted after the filing 
date of an application must be labeled in the top margin as either "Replacement Sheet" 
or "New Sheet" pursuant to 37 CFR 1.121(d). If the examiner does not accept the 
changes, the applicant will be notified and informed of any required corrective action in 
the next Office action. The objection to the drawings will not be held in abeyance. 



Application/Control Number: 10/051,462 Page 3 

Art Unit: 2654 

Claim Rejections - 35 USC § 102

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

4. Claims 1, 3 to 13, 15 to 17, and 19 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Takagi ('057). 

Regarding independent claims 1 and 15, Takagi ('057) discloses a speech 
recognition method, program code, and device compensating for background noise, 
comprising: 

"providing a set of reference speech spectra" - reference pattern 3 is words or 
sentences of speech of a standard speaker that have been analyzed (column 5, lines 1 
to 5: Figure 1);

"determining the reference speech spectra which correspond to the distorted
short-term speech spectra" - an average value of the spectra of the noise regions of 
each of the input speech and the reference pattern is used; additive noise and channel 
distortion ("distorted short-term speech spectra") of the input speech is matched with 
those of the reference pattern (column 6, lines 13 to 17: Figure 1); noise conditions of 
additive noise and channel distortion of recognized input speech and those of the 
reference pattern are matched (column 8, lines 34 to 46); a reference pattern is 
analyzed and matched to feature vectors of the input speech (column 5, lines 5 to 18); 
implicitly, feature vectors represent "short-term speech spectra" because feature vectors 
correspond to one frame of speech, which is the shortest time period for speech
analysis; 

"estimating a frequency response taking into account both the distorted short- 
term speech spectra and the corresponding reference speech spectra" - spectral 
transforming portion 4 transforms the time sequence X(t) of the feature vectors of the 
input speech and the time sequence Y(t) of the feature vectors of the reference pattern 
into time sequences V(t) and W(t) of spectra; cepstra are transformed into spectra 
(column 5, lines 18 to 30: Figure 1); spectra represent "a frequency response" because 
a spectrum of speech gives an amplitude for each speech frequency; 

"compensating the distorted short-term speech spectra based on the estimated 
frequency response" - compensating portion 6 matches additive noise and channel 
distortion of the input speech with those of the reference pattern corresponding to 
Equations (11) and (13); compensation is performed by multiplying one of the reference
pattern and the input speech by a predetermined channel distortion so that the average 
value of the speech pattern becomes equal to that of the input speech (column 8, lines 
4 to 21: Figure 1); here, multiplying the input speech by a predetermined channel 
distortion provides for "compensating the distorted short-term speech spectra"; 
Equations (11) and (13) are stated to be spectra of speech regions, so compensation is
"based on the estimated frequency response." 

Regarding independent claims 17 and 19, Takagi ('057) further discloses a 
database for storing reference speech spectra because reference patterns 3 are 
implicitly stored in a database element, as illustrated (Figure 1); additionally, a
processor implicitly performs the method steps of the flowchart (Figure 1). 

Regarding claim 3, Takagi ('057) discloses compensating speech as a spectrum 
of the input speech and a reference pattern ("in the spectral domain"). 
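
For illustration only, the spectral-domain compensation mapped above to claims 1, 3, and 15 can be sketched as follows. This is not a reproduction of Equations (11) and (13) of Takagi ('057), which are not quoted in this action. The sketch assumes a per-bin model in which an observed spectrum equals a channel gain times the clean speech plus additive noise, and assumes that speech and noise regions are already labeled. The function names, the `floor` parameter, and the list-of-lists data layout are hypothetical.

```python
from statistics import fmean


def mean_spectrum(frames):
    """Per-frequency-bin average over a list of equal-length spectra."""
    return [fmean(bin_values) for bin_values in zip(*frames)]


def compensate_reference(ref_frames, ref_is_speech, in_frames, in_is_speech, floor=1e-9):
    """Match the reference pattern's noise conditions to those of the input.

    Assumed (hypothetical) per-bin model: observed = gain * clean + noise.
    Each pattern must contain at least one speech frame and one noise frame.
    """
    ref_speech = mean_spectrum([f for f, s in zip(ref_frames, ref_is_speech) if s])
    ref_noise = mean_spectrum([f for f, s in zip(ref_frames, ref_is_speech) if not s])
    in_speech = mean_spectrum([f for f, s in zip(in_frames, in_is_speech) if s])
    in_noise = mean_spectrum([f for f, s in zip(in_frames, in_is_speech) if not s])

    # The ratio of noise-free speech averages stands in for the ratio of
    # channel distortions between the input and the reference pattern.
    gain = [
        max(i_s - i_n, floor) / max(r_s - r_n, floor)
        for i_s, i_n, r_s, r_n in zip(in_speech, in_noise, ref_speech, ref_noise)
    ]

    # Remove the reference noise, rescale by the gain, then add the input noise.
    return [
        [g * (w - r_n) + i_n for w, g, r_n, i_n in zip(frame, gain, ref_noise, in_noise)]
        for frame in ref_frames
    ]
```

Under these assumptions, the average of the compensated reference pattern over its speech region equals the average of the input speech over its speech region, which is the matching condition described for compensating portion 6.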

Regarding claim 4, Takagi ('057) discloses that spectra of additive noise B_w and
channel distortion A_w of a reference pattern are known (column 5, line 62 to column 6,
line 12); spectra represent a frequency response, so the reference patterns are 
obtained "from speech data subject to a known frequency response". 

Regarding claims 5 and 7, Takagi ('057) discloses that additive noise and 
channel distortion of input speech is matched to those of the reference pattern (column 
6, lines 13 to 17); matching involves finding a closest reference pattern to input speech. 

Regarding claim 6, Takagi ('057) discloses stored reference patterns 3 for 
speech recognition (column 5, lines 1 to 5: Figure 1); implicitly, reference patterns are 
known in the art as "models". 

Regarding claims 8 and 13, Takagi ('057) discloses compensating a reference 
pattern by taking an average of input speech for regions of additive noise and channel 
distortion during preliminary matching 2 (column 6, lines 22 to 57: Figure 1). 

Regarding claim 9, Takagi ('057) discloses matching input speech and reference 
patterns by a matching error (column 6, lines 8 to 12: Figure 1); a matching error 
represents a difference between input speech and a reference pattern. 

Regarding claim 10, Takagi ('057) discloses that average vector calculating portion 5
calculates the average vector of the time sequences of the spectra of the input speech
(column 9, lines 3 to 8: Figure 1).

Regarding claims 11 and 12, Takagi ('057) discloses using average values of
spectra of input speech and reference patterns (column 6, lines 13 to 17: Figure 1); an 
average is calculated by summing over previous samples K n and K<p (column 6, lines 22 
to 57); averaging over a number of past samples is equivalent to "smoothing". 
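
As a minimal sketch of why averaging over past samples acts as smoothing, the following computes a running per-bin mean over the most recent frames. The window length and the function name are assumptions chosen for illustration and are not taken from Takagi ('057).

```python
from collections import deque


def running_average(frames, window):
    """Yield, for each frame, the per-bin mean of the last `window` frames."""
    history = deque(maxlen=window)  # the oldest frame drops out automatically
    for frame in frames:
        history.append(frame)
        yield [sum(bin_values) / len(history) for bin_values in zip(*history)]
```

Each output value depends on several past frames, so frame-to-frame fluctuations are attenuated relative to the raw spectra.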

Regarding claim 16, Takagi ('057) discloses a procedure described by a 
flowchart (Figure 1), which is implicitly performed on a digital signal processor, with a 
recording medium storing the instructions of the procedure. 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 2, 14, and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Takagi ('057) in view of Takahashi. 

Concerning independent claim 14, Takagi ('057) discloses all the limitations, but 
does not expressly provide for "obtaining distorted speech spectra and analyzing the 
distorted speech spectra by means of a speech/nonspeech decision to filter out the 
distorted speech spectra that do not contain speech." In fact, however, Takagi ('057) 
discloses storing predetermined speech regions and noise regions of reference patterns
(column 5, lines 6 to 9), and using average values of speech and noise regions of the 
input speech (column 6, lines 13 to 17). Thus, while Takagi ('057) does not expressly 
disclose a speech/nonspeech decision filter to filter out distorted speech spectra that do 
not contain speech, implicitly, there must be a speech/nonspeech detector to decide 
which regions are speech regions and which regions are noise regions. Those skilled in 
the art know that a voice activity detector (VAD) ("a speech/nonspeech decision filter") 
is a common element for making speech/nonspeech decisions for a variety of purposes 
in speech processing. Specifically, Takahashi teaches noise suppression for removing 
noise from voice, where a voice/nonvoice discriminator 32 judges whether a voice 
signal separated into frames is voice or non-voice. The objective is to estimate a noise 
spectrum during silent periods so as to subtract a noise spectrum from a distorted 
speech spectrum and thereby correct a distorted speech spectrum to eliminate noise 
(Column 7, Line 38 to Column 8, Line 11: Figure 4). It would have been obvious to one
having ordinary skill in the art to analyze distorted speech with a speech/nonspeech 
decision as taught by Takahashi in the method of removing noise during speech 
recognition of Takagi ('057) for the purpose of estimating a noise spectrum during silent 
periods so that noise may be eliminated. 
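
The combination discussed above can be sketched, purely as an illustration, as a speech/nonspeech decision that filters out non-speech frames, estimates a noise spectrum from them, and subtracts that estimate from the frames judged to be speech. The energy threshold used for the decision is an assumption; this action does not describe how voice/nonvoice discriminator 32 of Takahashi makes its judgment. The function name and the parameters are hypothetical.

```python
def spectral_subtraction(frames, energy_threshold, floor=0.0):
    """Drop non-speech frames and subtract the running noise estimate from speech frames.

    The speech/nonspeech decision here is a simple energy threshold (an assumption).
    """
    noise_sum = None
    noise_count = 0
    cleaned = []
    for frame in frames:
        is_speech = sum(frame) > energy_threshold
        if not is_speech:
            # Non-speech frame: update the noise estimate and filter the frame out.
            if noise_sum is None:
                noise_sum = [0.0] * len(frame)
            noise_sum = [a + b for a, b in zip(noise_sum, frame)]
            noise_count += 1
            continue
        # Speech frame: subtract the average noise spectrum seen so far.
        if noise_count:
            noise = [s / noise_count for s in noise_sum]
        else:
            noise = [0.0] * len(frame)
        cleaned.append([max(x - n, floor) for x, n in zip(frame, noise)])
    return cleaned
```

Clamping at `floor` keeps a subtracted bin from going negative when the noise estimate exceeds the observed value.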

Concerning claim 2, similar considerations apply. 

Concerning claim 18, Takahashi discloses first spectrum memory 36a and 
second spectrum memory 36b for temporarily storing prior frames of speech spectra 
(column 7, lines 51 to 61), which are equivalent to "a buffer", a common expedient
implicit in speech processing. 

7. Claim 20 is rejected under 35 U.S.C. 103(a) as being unpatentable over Takagi 
('057) in view of Brown et al. 

Takagi ('057) discloses all of the limitations, omitting only "a distributed speech 
recognition system" having "a network server with central speech recognition means." 
However, distributed speech recognition with a client/server architecture and central 
speech recognition on a server are commonly known because more computationally 
intensive speech recognition activities may be performed on a server to minimize the 
computational requirements of a client. Specifically, Brown et al. teaches an acoustic 
speech recognizer system and method, where a phone browser 12 connects to speech 
recognition server 34 (Column 2, Line 23 to Column 3, Line 8: Figures 1 and 2). Brown
et al. states an advantage of a speech recognizer system that has a barge-in detector
discriminating between speech and noise, and does not need a push-to-talk command
(Column 1, Lines 35 to 56). It would have been obvious to one having ordinary skill in
the art to incorporate a speech recognition apparatus of Takagi ('057) into a distributed 
speech recognition system with a central speech recognition server as suggested by 
Brown et al. for the purpose of eliminating a need for a push-to-talk button. 

Conclusion

8. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Gong ('843), Hirayama, Gong ('842), Bruckner et al., Boll et al., Porter, Ponting et 
al., Cerisara et al., and Yamaguchi et al. disclose related art. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Martin Lerner whose telephone number is
(571) 272-7608. The examiner can normally be reached on 8:30 AM to 6:00 PM
Monday to Thursday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
703-872-9306.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free).